Measuring the purity of a qubit state: entanglement estimation with fully separable measurements

Given a finite number N of copies of a qubit state we compute the maximum fidelity that can
be attained using joint-measurement protocols for estimating its purity. We prove that in the
asymptotic $N \to \infty$ limit, separable-measurement protocols can be as efficient as the optimal
joint-measurement one if classical communication is used. This in turn shows that the optimal estimation
of the entanglement of a two-qubit state can also be achieved asymptotically with fully separable
measurements. Thus, quantum memories provide no advantage in this situation. The relationship
between our global Bayesian approach and the quantum Cramér-Rao bound is also discussed.



The ultimate goal of quantum state estimation is to 
determine the value of the parameters that fully char- 
acterize a given unknown quantum state. However, in 
practical applications, a partial characterization is often 
all one needs. Thus, e.g., knowing the purity of a qubit 
state or the degree of entanglement of a bipartite state 
may be sufficient to determine whether it can perform 
some particular task. See Ref. [?] for recent experimental
progress on estimating the degree of polarization
(the purity) of light beams. This paper concerns this 
type of situation. 

To be more specific, assume we are given $N$ identical
copies of an unknown qubit mixed state $\rho(\vec r)$, so that the
state of the total system is $\rho^N(\vec r) \equiv [\rho(\vec r)]^{\otimes N}$. The set
of all such density matrices $\{\rho(\vec r)\}$ can be mapped into
the Bloch sphere $B = \{\vec r : r \equiv |\vec r| \le 1\}$ through the
relation $\rho(\vec r) = (1 + \vec r\cdot\vec\sigma)/2$, where $\vec\sigma = (\sigma_x, \sigma_y, \sigma_z)$ is
a vector made out of the three standard Pauli matrices.
Our aim is to estimate the purity, $r$, as accurately as
possible by performing suitable measurements on the $N$
copies, i.e., on $\rho^N(\vec r)$. This problem can also be viewed
as the parameter estimation of a depolarizing channel [?, 3]
when it is fed with $N$ identical states.

The estimation protocols are broadly divided into two 
classes depending on the type of measurements they use: 
joint and separable. The former treats the system of N 
qubits as a whole, allowing for the most general mea- 
surements, and leads to the most accurate estimates or, 
equivalently, to the largest fidelity (properly defined below).
The latter treats each copy separately, but classical
communication can be used in the measurement process.
This class is particularly important because it is feasible
with present-day technology and it offers an economy of
resources. In this paper we show that for a sufficiently
large $N$, separable measurement protocols for purity estimation
can attain the optimal joint-measurement fidelity
bound. The power of separable measurement protocols
in achieving optimal performance has also been demonstrated
in other contexts [?].

It has been shown that given $N$ copies of a bipartite
qubit pure state, $|\Psi\rangle_{AB}$, the optimal protocol for measuring
its entanglement consists in estimating the purity of
$\rho(\vec r) = \mathrm{tr}_B(|\Psi\rangle_{AB}\langle\Psi|)$, where $\mathrm{tr}_B$ is the partial trace over
the Hilbert space of party B (see [?] for related work
on bipartite mixed states). We thus show that for large
$N$ this entanglement can be optimally estimated by performing
just separable measurements on one party (party
A in this discussion) of each of the $N$ copies of $|\Psi\rangle_{AB}$.

Though many of our results here concern finite N, spe- 
cial attention is paid to the asymptotic regime, when N 
is large. There are several reasons for this. First, in 
this limit, formulas greatly simplify and usually reveal 
important features of the estimation protocol. Second, 
the asymptotic theory of quantum statistical inference,
which has become in recent years a very active field in
mathematical statistics [?], deals with problems such as
the one at hand. Our results give support to some quantum
statistical methods for which only heuristic proofs
exist; e.g., the applicability of the integrated quantum
Cramér-Rao bound in the Bayesian approach (which is
formulated below) [11].

In the first part of this paper we obtain the optimal 
joint estimation protocols and the corresponding fidelity 
bounds. In addition to the general case of states in $B$,
which was partially addressed in [?], we also discuss the
situation when the unknown state is constrained to lie
on the equatorial plane $E$ of the Bloch sphere $B$. In the
second part, we discuss separable measurement protocols,
prove that they saturate the joint-measurement
bound asymptotically, and state our conclusions.

Mathematically, the problem of estimating the purity
of $\rho(\vec r)$ can be formulated within the Bayesian framework
as follows (see [?] for an alternative approach).
Let $\mathcal{R}_O = \{R_\chi\}$ be the set of estimates of $r$, each of
them based on a particular outcome $\chi$ of some generalized
measurement, $O$, over $\rho^N(\vec r)$. In full generality,
we assume that such a measurement is characterized by
a Positive Operator Valued Measure (POVM), namely,
by a set of positive operators $O = \{O_\chi\}$ that satisfy
$\sum_\chi O_\chi = \mathbb{1}$ ($\chi$ can be a continuous variable, in which
case the sum becomes an integral over $\chi$). A separable
measurement is a particularly interesting instance of a
POVM for which each $O_\chi$ is a tensor product of $N$ individual
operators (usually projectors), each one of them
acting on $\rho(\vec r)$. Next, a figure of merit, $f(r, R_\chi)$, is introduced
as a quantitative way of expressing the quality of
the purity estimation. Throughout this paper we use

$$f(r,R_\chi) \equiv 2\max_{\vec m}\left[\mathrm{tr}\sqrt{\rho^{1/2}(\vec r)\,\rho(R_\chi\vec m)\,\rho^{1/2}(\vec r)}\right]^2 - 1 = r R_\chi + \sqrt{1-r^2}\,\sqrt{1-R_\chi^2} \equiv \mathbf{r}\cdot\mathbf{R}_\chi, \tag{1}$$

where $|\vec m| = 1$, i.e., $[1 + f(r, R_\chi)]/2$ is the standard
fidelity [13] (see also [?]) between $\rho(\vec r)$ and $\rho(R_\chi\vec n)$,
where we have defined $\vec n = \vec r/r$. Throughout this paper
we refer to $f(r,R_\chi)$ also as fidelity for short. Its
values are in the range [0, 1], where unity corresponds to
perfect determination. It is interesting to note that in
Uhlmann's geometric representation of the set of density
matrices as the hemisphere $(1/2)S^3 \subset \mathbb{R}^4$, the function
$D(r,R_\chi) = (1/2)\arccos f(r, R_\chi)$ is the geodesic (Bures)
distance between two sets (two parallel 2-dimensional
spheres) characterized by the purities $r$ and $R_\chi$ respectively.
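The closed form of the figure of merit can be checked numerically against the standard (Uhlmann) fidelity. The sketch below is only an illustration: the helper names (`rho`, `psd_sqrt`, `uhlmann_fidelity`, `f`) are hypothetical and not taken from the paper, and the matrix square root is computed by eigendecomposition, which is one of several possible choices.

```python
import numpy as np

# Pauli matrices stacked as (sigma_x, sigma_y, sigma_z)
PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def rho(bloch):
    """Qubit density matrix (1 + r.sigma)/2 for a Bloch vector r."""
    bloch = np.asarray(bloch, dtype=float)
    return 0.5 * (np.eye(2) + np.einsum('i,ijk->jk', bloch, PAULI))

def psd_sqrt(m):
    """Square root of a positive semidefinite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T

def uhlmann_fidelity(a, b):
    """Standard fidelity [tr sqrt(sqrt(a) b sqrt(a))]^2."""
    s = psd_sqrt(a)
    return float(np.real(np.trace(psd_sqrt(s @ b @ s))) ** 2)

def f(r, R):
    """Figure of merit f(r, R) = r R + sqrt(1 - r^2) sqrt(1 - R^2)."""
    return r * R + np.sqrt(1 - r**2) * np.sqrt(1 - R**2)

# Same direction for state and guess: 2 * fidelity - 1 reproduces f(r, R).
n = np.array([0.6, 0.0, 0.8])
r, R = 0.7, 0.4
print(2 * uhlmann_fidelity(rho(r * n), rho(R * n)) - 1, f(r, R))
```

Choosing any other unit vector for the guess can only lower the fidelity, which is why the maximization over the direction in the definition is attained when the guess is aligned with the state.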

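In the same spirit as in [?, 16], we have written
$f(r,R_\chi)$ as a scalar product of the two unit vectors
$\mathbf{a} = (\sqrt{1-a^2}, a)$; $a = r, R_\chi$. The optimal protocol is
obtained by maximizing

$$F(O,\mathcal{R}_O) = \sum_\chi \int d\rho\, f(r,R_\chi)\,\mathrm{tr}[\rho^N(\vec r)\,O_\chi], \tag{2}$$

where $d\rho$ is the prior probability distribution of $\rho(\vec r)$, and
we identify the trace as the probability of obtaining the
outcome $\chi$ given that the state we measure upon is $\rho^N(\vec r)$.
Thus, $F$ is the average fidelity. The maximization is over
the estimator (guessed purity) $\mathcal{R}_O$ and the POVM $O$.
Using the Schwarz inequality the optimal estimator is easily
seen to be

$$\mathbf{R}^{\rm opt}_\chi = \frac{\mathbf{V}_\chi}{\sqrt{\mathbf{V}_\chi\cdot\mathbf{V}_\chi}}; \qquad \mathbf{V}_\chi = \int d\rho\;\mathbf{r}\,\mathrm{tr}[\rho^N(\vec r)\,O_\chi], \tag{3}$$

and

$$F(O) \equiv \max_{\mathcal{R}_O} F(O,\mathcal{R}_O) = \sum_\chi \sqrt{\mathbf{V}_\chi\cdot\mathbf{V}_\chi}. \tag{4}$$

We are still left with the task of computing $F_{\max} = \max_O F(O)$.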
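The Schwarz-inequality step can be spelled out in two lines. This is a short worked version that uses only the definitions above, with $\mathbf{R}_\chi = (\sqrt{1-R_\chi^2}, R_\chi)$ a unit vector and $\mathbf{V}_\chi = \int d\rho\,\mathbf{r}\,\mathrm{tr}[\rho^N(\vec r)O_\chi]$:

```latex
\begin{align}
F(O,\mathcal{R}_O)
  &= \sum_\chi \int d\rho\;(\mathbf{r}\cdot\mathbf{R}_\chi)\,
     \mathrm{tr}[\rho^N(\vec r)\,O_\chi]
   = \sum_\chi \mathbf{R}_\chi\cdot\mathbf{V}_\chi \\
  &\le \sum_\chi |\mathbf{R}_\chi|\,|\mathbf{V}_\chi|
   = \sum_\chi \sqrt{\mathbf{V}_\chi\cdot\mathbf{V}_\chi},
\end{align}
```

with equality if and only if each $\mathbf{R}_\chi$ is parallel to $\mathbf{V}_\chi$, i.e., $\mathbf{R}_\chi = \mathbf{V}_\chi/|\mathbf{V}_\chi|$. Since both components of $\mathbf{V}_\chi$ are non-negative, the resulting guess $R_\chi$ automatically lies in $[0,1]$.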

In this formulation, we need to provide a prior probability
distribution (prior for short) $d\rho$, which encodes our
initial knowledge about $\rho(\vec r)$. Here we assume that we are
completely ignorant of both $\vec n$ and $r$. Our lack of knowledge
about the former is properly represented with the choice
$d\rho \propto d\Omega$ (solid angle element), which states that a priori
$\vec n$ is isotropically distributed on $B$. Therefore, we write

$$d\rho = \frac{d\Omega}{4\pi}\, w(r)\,dr; \qquad \int_0^1 dr\, w(r) = 1. \tag{5}$$

While there is wide agreement in this respect, the $r$-dependence
of the prior is controversial and for the moment we
will not commit to any particular choice. Nevertheless,
it is worth keeping in mind that the hard sphere prior
$w(r) = 3r^2$ shows up in the context of entanglement estimation [3],
whereas the Bures prior $w(r) = (4/\pi)\,r^2(1-r^2)^{-1/2}$
is most natural in connection with distinguishability
of density matrices [13, 14, 18].

We are now in a position to compute $F_{\max}$. We
first assume no constraint on $O$, thus allowing for the
most general measurement setup. The density matrix
$\rho^N(\vec r)$ can be written in a block-diagonal form, where
each block, $\rho_{Nj\alpha}(\vec r)$, transforms with a corresponding
spin $j$ irreducible representation of SU(2) and $\alpha$ ($\alpha = 1, 2, \dots, n_j$)
labels the different $n_j$ occurrences of the
same block [?]. This implies that each element, $O_\chi$,
of the optimal POVM can be likewise chosen to have the
same block-diagonal structure.

Given a POVM $O$ of this type, we consider the two-stage
measurement protocol $\tilde O$ consisting of (i) a 'preliminary'
measurement of the projection of the state $\rho^N(\vec r)$
onto the SU(2) irreducible subspaces, followed by (ii) the
measurement defined by $O$. The outcomes of $\tilde O$ are thus
labeled by three indexes $\chi = (j, \alpha, \xi)$, and the corresponding
operators are defined by $\tilde O_{j\alpha\xi} \equiv \mathbb{1}_{j\alpha} O_\xi \mathbb{1}_{j\alpha}$.
Since the projector on each irreducible subspace,
$\mathbb{1}_{j\alpha} = \sum_m |jm;\alpha\rangle\langle jm;\alpha|$, commutes with $\rho^N(\vec r)$, the probabilities
$\mathrm{tr}[\rho^N(\vec r)\,O_\xi]$ are the marginals of $\mathrm{tr}[\rho^N(\vec r)\,\tilde O_{j\alpha\xi}]$ and
the fidelity cannot decrease by using $\tilde O$ instead of the
original $O$. In our quest for optimality, we thus stick to
these two-stage measurements.

We next recall that $\rho(\vec r) = U\rho(r\hat z)U^\dagger$ for a suitable
SU(2) transformation $U$, where $\hat z$ is the unit vector along
the $z$ axis, and that $d\Omega$ can be replaced by the Haar measure
of SU(2). Using Schur's lemma the integral in (3)
gives

$$\mathbf{V}_{j\alpha\xi} = \frac{\mathrm{tr}\,\tilde O_{j\alpha\xi}}{2j+1}\int dr\, w(r)\,\mathbf{r}\;\mathrm{tr}[\rho_{Nj\alpha}(r\hat z)]. \tag{6}$$

Hence, the estimate $R^{\rm opt}_{j\alpha\xi}$ turns out to be independent
of the outcomes $\xi$ (of $O$), and we can write $R^{\rm opt}_{j\alpha}$
instead. This, in turn, renders the maximization over $O$
trivial, since, using the relation $\sum_\xi \tilde O_{j\alpha\xi} = \mathbb{1}_{j\alpha}$, we see
that the right hand side of (4) becomes also independent
of $O$, and we can drop the subscript $\xi$ from now on.

The bottom line is that, assuming an isotropic prior,
the optimal purity estimation is entirely based on the
outcomes of the preliminary measurement $\{\mathbb{1}_{j\alpha}\}$ (no additional
information about the purity can be extracted from the state) and we might as
well choose not to perform any further measurement
($\{O_\xi\} \to \mathbb{1}$). With this choice, the prefactor in (6) becomes
unity. Since the $n_j$ spin $j$ blocks $\rho_{Nj\alpha}$ all give an
identical contribution

$$\mathrm{tr}[\rho_{Nj\alpha}(r\hat z)] = (p_r q_r)^{N/2-j}\sum_{m=-j}^{j} p_r^{\,j-m}\, q_r^{\,j+m}, \tag{7}$$

where $p_r = (1-r)/2$, $q_r = 1-p_r$, the left hand side of (6)
can be simply called $\mathbf{V}_j$.

FIG. 1: A log-linear plot of $N(1 - F_{\max})$ in terms of the
number $N$ of copies for the optimal joint measurement and
for the Bures (solid line) and hard sphere (dashed line) priors.

The maximal fidelity is thus given by

$$F_{\max} = \sum_j \binom{N}{N/2-j}\,\frac{2j+1}{N/2+j+1}\,|\mathbf{V}_j|, \tag{8}$$

where the coefficient in front of $|\mathbf{V}_j|$ is the multiplicity $n_j$ [11, 19].
This, together with the prior and the expression for $\mathbf{V}_j$
given above, provides an explicit expression of $F_{\max}$.
For large $N$, this can be computed to be

$$F_{\max} = 1 - \frac{1}{2N} + o(N^{-1}). \tag{9}$$

One can also check that at leading order in $1/N$ the optimal
guess is $R^{\rm opt} = 2j/N$, as one would intuitively expect.
These asymptotic results hold for any prior $w(r)$.
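For finite $N$ the optimal joint-measurement fidelity can be evaluated numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes the standard block trace $\mathrm{tr}[\rho_{Nj\alpha}(r\hat z)] = (p_r q_r)^{N/2-j}\sum_m p_r^{j-m}q_r^{j+m}$ and the multiplicity $n_j = \binom{N}{N/2-j}(2j+1)/(N/2+j+1)$, uses Gauss-Legendre quadrature for the $r$-integral, and all function names are hypothetical.

```python
from math import comb
import numpy as np

def multiplicity(N, two_j):
    """Number n_j of spin-j blocks in N qubits; two_j = 2j has the parity of N."""
    k = (N - two_j) // 2                      # N/2 - j
    return comb(N, k) * (two_j + 1) // ((N + two_j) // 2 + 1)

def block_trace(N, two_j, r):
    """Trace of one spin-j block of rho(r z)^{(x)N}, as a function of r."""
    p = (1.0 - r) / 2.0
    q = 1.0 - p
    k = (N - two_j) // 2
    s = sum(p**i * q**(two_j - i) for i in range(two_j + 1))
    return (p * q)**k * s

def f_max(N, w, nodes=400):
    """F_max = sum_j n_j |V_j| with V_j = int dr w(r) (sqrt(1-r^2), r) tr[block]."""
    x, wt = np.polynomial.legendre.leggauss(nodes)
    r = 0.5 * (x + 1.0)                       # map nodes from [-1, 1] to [0, 1]
    wt = 0.5 * wt
    total = 0.0
    for two_j in range(N % 2, N + 1, 2):
        g = wt * w(r) * block_trace(N, two_j, r)
        v_perp = np.sum(g * np.sqrt(1.0 - r**2))
        v_par = np.sum(g * r)
        total += float(multiplicity(N, two_j)) * float(np.hypot(v_perp, v_par))
    return total

def hard_sphere(r):
    """Hard sphere prior w(r) = 3 r^2."""
    return 3.0 * r**2

for N in (1, 5, 20, 100):
    F = f_max(N, hard_sphere)
    print(N, F, N * (1.0 - F))
```

Printing $N(1-F_{\max})$ for increasing $N$ gives a direct way to compare with the asymptotic value $1/2$; the number of quadrature nodes may need to be increased for very large $N$, where the integrand becomes sharply peaked.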

In Fig. 1 we plot $N(1 - F_{\max})$ as a function of $N$
in the range 10-5000 for states in $B$ and for the Bures
(solid line) and the hard sphere (dashed line) priors. The
two lines are seen to approach the asymptotic value 1/2
[which can be read off from Eq. (9)] for large $N$ at a
similar rate.

It is also interesting to analyze the case where $\vec r$ is
known to lie on the equatorial plane $E$. With this
information, the prior probability distribution becomes
$d\rho = (d\phi/2\pi)\,w(r)\,dr$, where $\phi$ is the polar angle of the
spherical coordinates. Though it is still possible to use
the block-diagonal decomposition discussed above, the
individual blocks are now reducible under the unitary
symmetry transformations on $E$, i.e., under a U(1) subgroup
of SU(2). In full analogy to the general case, the
optimal POVM is given by the set of one-dimensional
projectors over the U(1)-invariant subspaces, $\{\mathbb{1}_{j\alpha m} = |jm;\alpha\rangle\langle jm;\alpha|\}$,
and, as above, the equivalent representations,
labelled by $\alpha$, contribute a multiplicative factor $n_j$.
The analogue of (7) is now

$$[\rho_{Nj\alpha}(r\hat x)]_{mm} = (p_r q_r)^{N/2-j}\sum_{m'=-j}^{j}\left[d^{(j)}_{mm'}\!\left(\tfrac{\pi}{2}\right)\right]^2 p_r^{\,j-m'}\, q_r^{\,j+m'}, \tag{10}$$

where $d^{(j)}_{mm'}(\beta)$ are the standard Wigner d-matrices [21].
From (10) we can compute $\mathbf{V}_{jm}$ and $F_{\max}$, as in (8),
where in this case the sum extends over $j$ and $m$. The
resulting expression can be evaluated for small $N$ but it
is not very enlightening. The corresponding plots for the
analogues of the Bures and hard sphere priors are indistinguishable
from those in Fig. 1. Far more interesting is
the large $N$ regime. It turns out that $F_{\max}$ is also given
by (9) and the optimal guess becomes $m$ independent,
$R^{\rm opt}_{jm} = 2j/N + \dots$. Therefore, we see that the information
about $\vec n$ becomes irrelevant in the asymptotic limit.

A word regarding quantum statistical inference is in
order here. It is often argued that the quantum Cramér-Rao
bound [22] can be integrated to provide an attainable
asymptotic lower bound for some averaged figures
of merit, such as the fidelity. Ours is a so-called one
parameter problem for which the quantum Cramér-Rao
bound takes the simple form $\mathrm{Var}\,R \ge H^{-1}(r)/N$, where
$\mathrm{Var}\,R = \langle (R_\chi - \langle R_\chi\rangle)^2\rangle$ is the variance of the estimator
$R_\chi$, the average is over the outcomes $\chi$ of a measurement,
$H(r)$ is the quantum information matrix [22], and
$R_\chi$ is assumed to be unbiased: $\langle R_\chi\rangle = r$. In our case
$H(r) = (1-r^2)^{-1}$, and the bound is attainable. This provides
in turn an attainable asymptotic upper bound for
the fidelity (1), since $\langle f(r,R_\chi)\rangle \approx 1 - \tfrac12 H(r)\,\mathrm{Var}\,R + \dots$.
Assuming one can integrate this relation over the whole
of $B$ (including the region $r \approx 1$, where $H(r)$ is singular),
with a weight function given by the prior (5), we obtain
Eq. (9). Unfortunately, there are only heuristic arguments
supporting this assumption, and so far no rigorous
proof exists in the literature [23].
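The expansion of the fidelity used in this argument can be made explicit. As a worked sketch (not taken from the original text), parametrize $r = \sin\theta_r$ and $R = \sin\theta_R$, so that the figure of merit becomes $f = \cos(\theta_R - \theta_r)$:

```latex
\begin{align}
f(r,R) &= \cos(\theta_R-\theta_r)
        \approx 1 - \tfrac12(\theta_R-\theta_r)^2
        \approx 1 - \frac{(R-r)^2}{2(1-r^2)}
        = 1 - \tfrac12 H(r)\,(R-r)^2, \\
\langle f(r,R_\chi)\rangle &\approx 1 - \tfrac12 H(r)\,\mathrm{Var}\,R
        = 1 - \frac{1}{2N}
        \quad\text{when}\quad \mathrm{Var}\,R = \frac{H^{-1}(r)}{N}, \\
\int_0^1 dr\,w(r)\,\langle f(r,R_\chi)\rangle &\approx 1 - \frac{1}{2N}.
\end{align}
```

The second step uses $d\theta/dR = (1-R^2)^{-1/2}$. Because the $r$ dependence cancels when the bound is saturated, the result does not depend on the normalized prior $w(r)$, provided the pointwise expansion can be integrated up to $r = 1$, which is precisely the assumption flagged above.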

We now abandon the joint protocols to dwell on sep- 
arable measurement strategies for the rest of the paper. 
Here we focus on the asymptotic regime, but some brief 
comments concerning small N can be found in the con- 
clusions. 

In previous work [16], some of the authors showed that
the maximum fidelity one can achieve in estimating both
$r$ and $\vec n$ (full estimation of a qubit mixed state) assuming
the Bures prior and using tomography behaves as

$$F^{\max}_{\rm full} = 1 - \frac{\xi}{N^{3/4}} + o(N^{-3/4}), \tag{11}$$

where $\xi$ is a positive constant. One should expect the
same behavior for our fidelity $F_{\max}$, since the effect of
the purity estimation is dominant in (11). This strange
power law, somewhat unexpected on statistical grounds,
is caused by the behavior of $w(r)$ in a small region
$r \approx 1$. Indeed, it is not difficult to convince oneself that
if $w(r) \propto (1-r^2)^{-\lambda} \approx [2(1-r)]^{-\lambda}$ for $r \approx 1$, one should
expect $1 - F_{\max} \propto N^{\lambda/2-1} + \dots$, for $0 < \lambda < 1$ (for $\lambda = 0$,
hard sphere prior, one should expect logarithmic corrections).
This differs drastically from (9) which, as stated
above, holds for any such values of $\lambda$. Would classical
communication be enough to restore the right power law
$N^{-1}$ for $1 - F_{\max}$ and, moreover, saturate the bound of
the optimal joint protocol?

On quantum statistical grounds, one should expect
a positive answer to this question since the quantum
Cramér-Rao bound is attained by a separable protocol
consisting in performing the (von Neumann) measurements
$M = \{(1 \pm \vec n\cdot\vec\sigma)/2\}$ on each copy. Note, however,
that $M$ depends on $\vec n = \vec r/r$, which is, of course, unknown
a priori. This protocol can only make sense if
we are ready to spend a fraction of the $N$ copies of $\rho(\vec r)$
to obtain an estimate of $\vec n$, use this classical information
to design $M$ and, finally, perform this adapted measurement
on the remaining copies. This protocol was successfully
applied to pure states by Gill and Massar in [24].
We extend it to purity estimation below.

Let us consider a family of priors of the form

$$w(r) = \frac{4}{\sqrt{\pi}}\,\frac{\Gamma(5/2-\lambda)}{\Gamma(1-\lambda)}\; r^2 (1-r^2)^{-\lambda}, \tag{12}$$

which includes both the Bures ($\lambda = 1/2$) and the hard
sphere ($\lambda = 0$) metrics. Despite this particular $r$ dependence,
the final results apply to any prior whose behavior
near $r = 1$ is given by (12).
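The normalization of this family is easy to verify numerically. The following is a minimal sketch with hypothetical function names; it substitutes $r = \sin t$ to remove the endpoint singularity at $r = 1$ before applying Gauss-Legendre quadrature, which is a choice made here for convenience.

```python
from math import gamma, pi, sqrt
import numpy as np

def prior_norm(lam):
    """Normalization constant (4/sqrt(pi)) Gamma(5/2 - lam) / Gamma(1 - lam)."""
    return 4.0 / sqrt(pi) * gamma(2.5 - lam) / gamma(1.0 - lam)

def prior(r, lam):
    """Prior density w(r) = norm * r^2 * (1 - r^2)^(-lam) on 0 <= r < 1."""
    return prior_norm(lam) * r**2 * (1.0 - r**2)**(-lam)

def prior_integral(lam, nodes=100):
    """Integral of w(r) over [0, 1], computed with r = sin(t), t in [0, pi/2]."""
    x, wt = np.polynomial.legendre.leggauss(nodes)
    t = 0.25 * pi * (x + 1.0)
    wt = 0.25 * pi * wt
    # w(sin t) cos t dt = norm * sin^2 t * cos^(1 - 2 lam) t dt
    integrand = prior_norm(lam) * np.sin(t)**2 * np.cos(t)**(1.0 - 2.0 * lam)
    return float(np.sum(wt * integrand))

print(prior_norm(0.0), prior_norm(0.5), 4.0 / pi)
print(prior_integral(0.0), prior_integral(0.5))
```

The constants reduce to $3$ for the hard sphere prior and $4/\pi$ for the Bures prior, in agreement with the expressions quoted earlier in the text.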

We now proceed à la Gill-Massar [24] and consider the
following one-step adaptive protocol: we take a fraction
$N^\alpha \equiv N_0$ ($0 < \alpha < 1$) of the $N$ copies of $\rho(\vec r)$ and we
use them to estimate $\vec n$. Tomography along the three
orthogonal axes $x$, $y$ and $z$, together with a very elementary
estimation based on the relative frequencies of the
outcomes [?], enables us to estimate $\vec n$ with an accuracy
given by

$$\frac{\langle\Theta_r^2\rangle}{2} \approx 1 - \langle\cos\Theta_r\rangle = \frac{3}{N_0}\left(\frac{1}{r^2} - \frac{1}{5}\right) + o(N_0^{-1}), \tag{13}$$

where $\Theta_r$ is the angle between $\vec n$ and its estimate. Here
and below $\langle\cdots\rangle$ is not only the average over the outcomes
of these tomography measurements, but also contains an
integration over the prior angular distribution $d\Omega/(4\pi)$
for fixed $r$. We see from (13) that the pure state limit
is $\langle\Theta_1^2\rangle \approx 24/(5N_0) + \dots$, and one can compute the
fidelity, as defined in [?], to check that it agrees with
the result therein. This concludes the first step of the
protocol.

In a second step, we measure the projection of $\vec\sigma$ along
the estimated $\vec n$ obtained in the previous step. We perform
this von Neumann measurement on each of the remaining
$N - N_0 \equiv N_1$ copies of the state $\rho(\vec r)$. We estimate
the purity to be $R = 2N_+/N_1 - 1$, where $N_\pm/N_1$ is
the relative frequency of $\pm 1$ outcomes, and we drop the
$N_+$ dependence of $R$ to simplify the notation.
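The two steps of the protocol can be simulated directly. The sketch below is illustrative only: the function name and signature are hypothetical, the direction estimate is the normalized vector of relative-frequency estimates (one simple choice of "elementary estimation"), and the second step uses the copies left over after the $N_0$ tomography copies are split equally among the three axes.

```python
import numpy as np

def adaptive_purity_estimate(r_vec, N, alpha, rng):
    """One run of the one-step adaptive protocol on N copies of rho(r_vec).

    Returns the purity estimate R and the estimated direction n_hat.
    """
    r_vec = np.asarray(r_vec, dtype=float)
    N0 = int(round(N**alpha))        # copies spent on tomography
    n_axis = N0 // 3                 # copies measured along each of x, y, z

    # Step 1: outcome +1 along axis i occurs with probability (1 + r_i)/2.
    ups = rng.binomial(n_axis, (1.0 + r_vec) / 2.0)
    r_hat = 2.0 * ups / n_axis - 1.0
    n_hat = r_hat / np.linalg.norm(r_hat)

    # Step 2: von Neumann measurement along n_hat on the remaining copies.
    N1 = N - 3 * n_axis
    p_plus = (1.0 + float(r_vec @ n_hat)) / 2.0
    N_plus = rng.binomial(N1, p_plus)
    R = 2.0 * N_plus / N1 - 1.0
    return float(np.clip(R, 0.0, 1.0)), n_hat

rng = np.random.default_rng(0)
n_true = np.array([0.6, 0.0, 0.8])
R, n_hat = adaptive_purity_estimate(0.5 * n_true, N=10**6, alpha=0.6, rng=rng)
print(R, float(n_hat @ n_true))
```

The estimate is centered on $r\cos\Theta_r$ rather than on $r$, so a poor direction estimate biases the purity downwards; this is the effect that the choice of $\alpha$ has to control.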

Obviously, as a random variable and for large $N_1$, $R$ is
normally distributed as $R \sim \mathcal{N}(r c_r, \sqrt{1 - r^2c_r^2}/\sqrt{N_1})$,
where $c_r \equiv \cos\Theta_r$. Hence, for large $N_0$ and $N_1$ it makes
sense to expand $f(r,R)$, Eq. (1), around $R = r c_r$, and
thereafter, because of (13), expand the resulting expression
around $c_r = 1$. We obtain

$$F(r) = 1 - \frac{1}{2N_1} + \frac{r^2}{1-r^2}\left(\frac{\langle\Theta_r^2\rangle}{4N_1} - \frac{\langle\Theta_r^4\rangle}{8}\right) + \dots, \tag{14}$$

where $F(r)$ is the average fidelity for fixed $r$, i.e.,
$\int dr\, w(r)F(r) = F$. In view of (13), $\langle\Theta_r^4\rangle \sim N_0^{-2} = N^{-2\alpha}$.
Hence, the two terms in parentheses in (14) can
only be dropped if $\alpha > 1/2$. Provided $w(r)$ vanishes as
in (12) with $\lambda < 0$, we can integrate $r$ in (14) over the
unit interval to obtain

$$F = 1 - \frac{1}{2N(1 - N^{\alpha-1})} + o(N^{-1}), \tag{15}$$

and we conclude that this protocol attains asymptotically
the joint-measurement bound (9).

However, most of the physically interesting priors $w(r)$ [?]
not only do not vanish as $r \to 1$, but often
diverge like (12) with $0 < \lambda < 1$. In this case (14) cannot
be integrated, as the last term does not lead to a convergent
integral. This signals that the series expansion
around $c_r = 1$ leading to (14) is not legitimate in the
whole of $B$.

To fix the problem, we split $B$ in two regions: a sphere
of radius $1-\epsilon$, $\epsilon > 0$, which we call $B^{I}$, and a spherical
shell of thickness $\epsilon$, $B^{II} = \{\vec r : 1-\epsilon < r \le 1\}$. The
fidelity can thus be written as the sum of the corresponding
two contributions: $F = F^{I} + F^{II}$. While $F^{I}$ can be
obtained by simply integrating (14) over $B^{I}$, where this
expansion is valid, some care must be taken in the region
$B^{II}$. There, we proceed as follows.

We compute the fidelity as if all the states in $B^{II}$ had
the lowest possible purity ($r = 1-\epsilon$) when the first-step
tomography was performed. This leads to a lower bound
for $F^{II}$, because the lower the purity of a state the less
accurately $\vec n$ can be determined [see Eq. (13)], and hence,
the worse its purity can be estimated in the second step.
The trick, which amounts to replacing $c_r$ by $c_{1-\epsilon}$, enables
us to perform the $r$-integration prior to $\langle\cdots\rangle$. We simply
expand $f(r, R)$, Eq. (1), around $R = r c_{1-\epsilon}$ to obtain

$$F(r) \ge \left\langle \sqrt{(1-r^2)(1-r^2c_{1-\epsilon}^2)} + r^2 c_{1-\epsilon} - \frac{1}{2N_1}\sqrt{\frac{1-r^2}{1-r^2c_{1-\epsilon}^2}} + \dots \right\rangle, \tag{16}$$

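where the dots stand for additional terms that are irrelevant
to the problem we are addressing here. Integrating
this expression and expanding around $c_{1-\epsilon} = 1$ we obtain

$$\int_{1-\epsilon}^{1} dr\, w(r)F(r) \ge 1 - \frac{1}{2N_1} - k_\lambda\left\langle(1-c_{1-\epsilon})^{2-\lambda}\right\rangle - \left(1 - \frac{1}{2N_1}\right)\int_0^{1-\epsilon} dr\, w(r) + \dots, \tag{17}$$

where $k_\lambda = 2^{2-\lambda}\,\Gamma(\tfrac32-\lambda)\,\Gamma(\tfrac52-\lambda)\,\Gamma(\lambda-2)/[\pi\,\Gamma(1-\lambda)]$.
Putting together the different pieces of the calculation
we have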

$$F \ge 1 - \frac{1}{2N_1} - 2^{\lambda-2}\,k_\lambda\left\langle(\Theta^2_{1-\epsilon})^{2-\lambda}\right\rangle + \dots, \tag{18}$$

for $0 < \lambda < 1$, where now we can safely take the limit $\epsilon \to 0$.
We see that by choosing

$$N_0 = N^\alpha, \qquad \max\left\{\frac12, \frac{1}{2-\lambda}\right\} < \alpha < 1, \tag{19}$$

we ensure that the joint-measurement bound (9) is attained.
It is worth emphasizing that the last term in (18),
which is completely missing in (15), is actually the dominant
contribution if $\alpha < 1/(2-\lambda)$. For $\lambda = 0$ we have



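$$F_{\rm hard} \ge 1 - \frac{1}{2N_1} + \frac{3}{16}\,\langle\Theta^4_{1-\epsilon}\rangle\log\langle\Theta^2_{1-\epsilon}\rangle + \dots, \tag{20}$$

and we again conclude that the protocol presented here
attains the joint-measurement bound.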

Two comments about the choice of $\alpha$ are in order.
First, numerical simulations show that the optimal value
of $\alpha$ is very close to the lower bound in (19). Second, we
see that the lower bound in (19) increases with increasing
$\lambda$. This can be understood by recalling that for large
$N$, the estimated purity $R$ is normally distributed with a
variance of $\mathrm{Var}\,R = (1 - r^2c_r^2)/N_1$. For $\lambda \ll 1$, the prior is
a rather flat function of $r$ and, on average, $\mathrm{Var}\,R \approx a/N_1$,
where $a$ is a constant. Increasing the accuracy by which $\vec n$
is determined does not improve significantly the estimation
of $r$. Hence, using a small fraction of the number of
copies at the first stage of the protocol should be enough.
This suggests that $\alpha$ must be relatively small. In contrast,
for $\lambda \approx 1$ the prior peaks at $r = 1$ and $\mathrm{Var}\,R \sim \langle\Theta^2\rangle/N_1$.
Hence, it pays to spend a large fraction of $N$ to estimate
$\vec n$ with high accuracy (as this drastically reduces $\mathrm{Var}\,R$),
for which we need that $\alpha \approx 1$.
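The window of admissible $\alpha$ and the competition between the two leading error terms can be tabulated with a few lines. This is a hedged sketch with hypothetical helper names; it only encodes the scalings $1/(2N_1) \sim N^{-1}$ and $\langle(\Theta^2)^{2-\lambda}\rangle \sim N_0^{-(2-\lambda)} = N^{-\alpha(2-\lambda)}$ for $0 < \lambda < 1$, ignoring constants and the logarithmic correction at $\lambda = 0$.

```python
def alpha_lower_bound(lam):
    """Lower end of the admissible window max{1/2, 1/(2 - lam)} < alpha < 1."""
    return max(0.5, 1.0 / (2.0 - lam))

def error_exponents(alpha, lam):
    """Exponents of N in the two leading contributions to 1 - F.

    Returns (statistical term ~ N^-1, direction-error term ~ N^(-alpha (2 - lam))).
    """
    return -1.0, -alpha * (2.0 - lam)

for lam in (0.25, 0.5, 0.75):
    a_min = alpha_lower_bound(lam)
    print(lam, a_min, error_exponents(a_min + 0.05, lam))
```

The joint-measurement bound is reached only when the second exponent is more negative than the first, which is the content of the condition on $\alpha$; for the Bures prior ($\lambda = 1/2$) this requires $\alpha > 2/3$.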

At this point one may wonder if the conclusions above
depend upon our particular choice of figure of merit. To
get a grasp on this, it is worth using again the standard
pointwise approach to quantum statistics. There, one
is interested in the mean square error $\mathrm{MSE}\,R = \langle(R - r)^2\rangle$
for fixed $r$, where now the average $\langle\cdots\rangle$ is over the
outcomes of all measurements for a fixed $r$. One can
write $\mathrm{MSE}\,R = \mathrm{Var}\,R + (\langle R\rangle - r)^2$, where the second term
is the squared bias. Using the same one-step adaptive protocol
described above, we get that the mean square error after
step two is

$$\mathrm{MSE}\,R = \frac{1-r^2}{N_1} + \frac{r^2}{4}\langle\Theta_r^4\rangle + \dots. \tag{21}$$

As above, the last term can be dropped if $\alpha > 1/2$, and

$$\mathrm{MSE}\,R = \frac{H(r)^{-1}}{N} + o(N^{-1}), \tag{22}$$

saturating the quantum Cramér-Rao bound. This protocol
is, therefore, also asymptotically optimal in the
present context. Though the argumentation above is
somewhat heuristic, it can be made fully rigorous [25].
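The structure of the mean square error follows from conditioning on the outcome of the first step. As a worked sketch (an illustration, not the rigorous argument referred to above), for a fixed direction error $c_r = \cos\Theta_r$ the second-step estimate has mean $r c_r$ and variance $(1 - r^2c_r^2)/N_1$, so that

```latex
\begin{align}
\mathrm{MSE}\,R
  &= \left\langle \frac{1-r^2c_r^2}{N_1} \right\rangle
     + r^2\left\langle (1-c_r)^2 \right\rangle \\
  &\approx \frac{1-r^2}{N_1}
     + \frac{r^2\langle\Theta_r^2\rangle}{N_1}
     + \frac{r^2}{4}\langle\Theta_r^4\rangle ,
\end{align}
```

where $c_r \approx 1 - \Theta_r^2/2$ has been used. With $\langle\Theta_r^2\rangle \sim N^{-\alpha}$ the middle term is of order $N^{-1-\alpha}$ and is always subleading, while the last term is of order $N^{-2\alpha}$ and is negligible compared with $N^{-1}$ only for $\alpha > 1/2$.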

In summary, we have addressed the problem of op- 
timally estimating the purity of a qubit state of which 
N identical copies are available. The optimal estima- 
tion of the entanglement of a bipartite qubit state can be 
reduced to this problem. Though the absolute bounds 
for the average fidelity involve joint measurements, these 
bounds can be obtained asymptotically with separable 
measurements. This requires classical communication 
among the sequential von Neumann measurements per- 
formed on each of the N individual copies of the state. 
This result, which has been speculated on quantum sta- 
tistical grounds, is here proved for the first time by a di- 
rect calculation. This leads to a very surprising result: in 
the asymptotic limit of many copies, bipartite entangle- 
ment, a genuinely non-local property, can be optimally 
estimated by performing fully separable measurements. 
This means that measurements can be performed not
only on copies of one of the two entangled parties, but on 
each of these copies separately. This avoids the necessity 
of quantum memories. 

For finite (but otherwise arbitrary) $N$, finding the optimal
separable measurement protocol is an open problem.
Interestingly enough, a 'greedy' protocol designed
to be optimal at each measurement step [?] leads to
an unacceptably poor estimation. Notice that in the one-step
adaptive protocol described above, part of the copies
were spent ('wasted' from a 'greedy' point of view) in estimating
$\vec n$. We have seen that this strategy pays in the
long run. However, the 'greedy' strategy optimizes measurements
in the short run, which translates into measuring
$\vec\sigma$ along the same arbitrarily fixed axis on each copy
of $\rho(\vec r)$. This yields a low value for the fidelity, which does
not even converge to unity in the strict limit $N \to \infty$.
This counterintuitive behavior of the 'greedy' protocol
also appears in other contexts such as economics, biology
or social sciences (see [26] for a nice example).